Variable-Rate Distributed Source Coding in the 
Presence of Byzantine Sensors 



Oliver Kosut and Lang Tong 
School of Electrical and Computer Engineering 
Cornell University, Ithaca, NY 14853 
Email: {oek2, lt35}@cornell.edu



Abstract — The distributed source coding problem is considered 
when the sensors, or encoders, are under Byzantine attack; that 
is, an unknown number of sensors have been reprogrammed by a 
malicious intruder to undermine the reconstruction at the fusion 
center. Three different forms of the problem are considered. The 
first is a variable-rate setup, in which the decoder adaptively 
chooses the rates at which the sensors transmit. An explicit 
characterization of the variable-rate minimum achievable sum 
rate is stated, given by the maximum entropy over the set of 
distributions indistinguishable from the true source distribution 
by the decoder. In addition, two forms of the fixed-rate problem 
are considered, one with deterministic coding and one with 
randomized coding. The achievable rate regions are given for 
both these problems, with a larger region achievable using 
randomized coding, though both are suboptimal compared to 
variable-rate coding. 

Index Terms — Distributed Source Coding. Byzantine Attack. 
Sensor Fusion. Network Security. 

I. Introduction 

Wireless sensor networks are vulnerable to various forms of 
attack. A malicious intruder could capture a sensor or a group 
of sensors and reprogram them, unbeknownst to the other 
sensors or the fusion center. The intruder could reprogram the
sensors to work cooperatively to obstruct or defeat the goal of 
the network, launching a so-called Byzantine attack. 

We refer to sensors that have been reprogrammed as traitors, 
and the rest, which will behave according to the specified 
procedure, as honest. Suppose there are m sensors and at most 
t traitors. At each time step, sensor i is informed of the value of
the random variable $X_i$. These random variables constitute
a discrete memoryless multiple source with probability distribution
$p(x_1 \cdots x_m)$. Each sensor encodes its observation
independently and transmits the codewords to a common 
decoder (the fusion center), which attempts to reconstruct the 
source values with small probability of error based on those 
transmissions. If there are no traitors, Slepian-Wolf coding [1] 
can be used to achieve a sum rate as low as

$$H(X_1 \cdots X_m). \tag{1}$$

However, standard Slepian-Wolf coding has no mechanism 
for handling any deviations from the agreed-upon encoding 
functions by the sensors. Even a random fault by a single 
sensor could have devastating consequences for the accuracy 
of the source estimates produced at the decoder, to say nothing 
of a Byzantine attack on multiple sensors. 



Consider a two-sensor example. If sensor 1 transmits at rate
$H(X_1)$ and sensor 2 transmits at rate $H(X_2|X_1)$, their source
sequences would normally be reconstructable using Slepian-Wolf
coding. Since sensor 2 transmits at a rate below $H(X_2)$, the
decoder must use the codeword from sensor 1 to decode $X_2$.
Thus, if sensor 1 is a traitor, it can manipulate the decoder's
estimate of $X_2$ to cause an error. Generalizing this, it will turn
out that for most source distributions, the sum rate given in
(1) cannot be achieved if there is even a single traitor. We will
present coding schemes that can handle Byzantine attacks, and
give explicit characterizations of the achievable rates.

A. Related Work 

The notion of Byzantine attack has its root in the Byzantine 
generals problem [2], [3] in which a clique of traitorous 
generals conspire to prevent loyal generals from forming 
consensus. It was shown in [2] that consensus is possible if
and only if fewer than a third of the generals are traitors.

Countering Byzantine attacks in communication networks 
has also been studied by many authors. See the
earlier work of Perlman [4] and the more recent reviews
[5], [6]. An information theoretic network coding approach to
Byzantine attack is presented in [7]. The problem of optimal 
Byzantine attack of sensor fusion for distributed detection is 
considered in [8]. Sensor fusion with Byzantine sensors was 
studied in [9]. In that paper, the sensors, having already agreed 
upon a message, communicate it to the fusion center over a 
discrete memoryless channel. Quite similar results were shown 
in [10], in which a malicious intruder takes control of a set 
of links in the network. The authors show that two nodes 
can communicate at a nonzero rate as long as less than half 
of the links between them are Byzantine. This is different 
from the current paper in that the transmitter chooses its 
messages, instead of relaying information received from an 
outside source, but some of the same approaches from [10] are 
used in the current paper, particularly the use of randomization 
to fool traitors that have already transmitted. 

B. Fixed-Rate Versus Variable-Rate Coding 

In standard multiterminal source coding, each sensor is 
associated with a rate and an encoding function that transmits 
information at that rate. We will show that this fixed-rate setup 
is suboptimal for this problem, in the sense that we can achieve 
lower sum rates using a variable-rate scheme. By variable-rate
we mean that the number of bits transmitted per source value
by a particular sensor will not be fixed. Instead, each sensor 
has a number of different encoding functions, each with its 
own rate. The coding session is then made up of a number of 
transactions. In each transaction, the decoder decides which 
sensor will transmit information, and which encoding function 
it should use. Thus we require that the decoder have a reverse 
channel to transmit information back to the sensors, but it need 
only send the chosen encoding function index, which will be 
one of a fixed and small number. In other words, the reverse 
channel could have arbitrarily small capacity. 

C. Honest Sensor Error Requirement 

Classical Slepian-Wolf coding requires that the decoder 
produce perfect estimates of every source value. However, 
this is no longer possible under Byzantine attack. A traitor 
could choose to send gibberish to the decoder, in which 
case the decoder could never correctly decode the associated 
source values. However, a traitor could also act exactly like 
an honest sensor, in which case the decoder would never 
be able to identify it as a traitor. Thus, the decoder will 
not necessarily be able to produce an accurate estimate for 
every sensor, but neither will it be able to tell which of 
its estimates are inaccurate. As a compromise, the decoder 
will produce an estimate for every source value, but we only 
require that the estimates corresponding to the honest sensors 
are correct, even though the decoder may not know which 
those are. This requirement is reminiscent of that of [2], in 
which the lieutenants need only perform the order given by 
the commander if the commander is not a traitor, even though 
the lieutenants might not know whether he is. 

D. Main Results 

The main results of this paper give explicit characterizations 
of the achievable rates for three different setups. The first, 
discussed in the most depth, is the variable-rate case, for which 
we give the minimum achievable sum rate. By definition, 
variable-rate coding involves varying the rates at which differ- 
ent sensors transmit. The choice of these rates will be based 
on "run time" events such as the source values and the actions 
of the traitors. Thus, there is no notion of an m-dimensional
achievable rate region, since all we can say is that, no matter 
what happens, the total number of transmitted bits will not 
exceed a certain value. The second two setups are fixed-rate, 
divided into deterministic coding and randomized coding, for 
which we do give m-dimensional achievable rate regions. We 
show that randomized coding yields a larger achievable rate 
region than deterministic coding, but we believe that in most 
cases randomized fixed-rate coding requires an unrealistic 
assumption. In addition, even randomized fixed-rate coding 
cannot achieve the same sum rates as variable-rate coding. 

For variable-rate coding, the minimum achievable sum rate 
is given by 

$$\sup_{q \in \mathcal{Q}} H_q(X_1 \cdots X_m) \tag{2}$$



where $H_q$ is the entropy with respect to the distribution q and
$\mathcal{Q}$ is a set of distributions which depends on t, the number
of allowed traitors. The explicit definition of $\mathcal{Q}$ is given later,
but intuitively $\mathcal{Q}$ is the set of distributions such that if we
simulated any distribution $q \in \mathcal{Q}$ and handed the resulting
source sequences to the decoder as if they had come from the
sensors, then it would not be able to correctly identify a single
traitor. For example, the source distribution p is always in $\mathcal{Q}$,
because if the decoder receives source sequences that appear
to come from the true distribution, it will not be able to know
which sensors are the traitors. In fact, if $t = 0$, $\mathcal{Q}$ is made up
of only the source distribution p, so (2) becomes (1). In other
words, this result matches the classical Slepian-Wolf result.

On the other hand, if $t = m - 1$, then the decoder knows only
that the one honest sensor will report source values distributed
according to its single variable marginal distribution, so a
traitor will not be detected if it also reports source values
distributed according to its marginal distribution. Hence $q \in \mathcal{Q}$
if $q(x_i) = p(x_i)$ for all i. It is easy to see that (2) becomes

$$H(X_1) + \cdots + H(X_m). \tag{3}$$

In effect, the decoder must use an independent source code
for each sensor.
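
The gap between the no-traitor sum rate (1) and the sum rate (3) for $t = m - 1$ can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-sensor source whose bits agree with probability 0.9. The helper names `marginal` and `entropy` and the example distribution are illustrative and do not come from the paper.

```python
from math import log2

def marginal(pmf, idx):
    """Marginal pmf of the coordinates listed in idx."""
    out = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def entropy(pmf, idx):
    """Entropy in bits of the marginal on the coordinates in idx."""
    return -sum(p * log2(p) for p in marginal(pmf, idx).values() if p > 0)

# Hypothetical source (an assumption for illustration): two correlated bits.
pmf = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
m = 2

no_traitor_rate = entropy(pmf, range(m))                     # expression (1)
all_but_one_rate = sum(entropy(pmf, [i]) for i in range(m))  # expression (3)
print(no_traitor_rate, all_but_one_rate)
```

For this source the sum of marginal entropies is 2 bits per symbol, while the joint entropy is strictly smaller, so the correlation between the sensors cannot be exploited when all but one of them may be traitors.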

The fixed-rate achievable regions are based on the Slepian- 
Wolf achievable region. For randomized coding, the achievable 
region is such that for every subset of $m - t$ sensors, the rates
associated with those sensors fall into the Slepian-Wolf rate
region on the corresponding $m - t$ random variables. Note
that for $t = 0$, this is identical to the Slepian-Wolf region. For
$t = m - 1$, this region is such that $R_i \ge H(X_i)$ for all i,
which corresponds to the sum rate in (3). The deterministic
region is similar, except that every subset of $m - 2t$ rates is
required to fall into the corresponding Slepian-Wolf region.

E. Randomization 

Randomization plays a key role in defeating Byzantine 
attacks. As we have discussed, allowing randomized encoding 
in the fixed-rate situation expands the achievable region. In ad- 
dition, the variable-rate coding scheme that we propose relies 
heavily on randomization to achieve small probability of error. 
In both fixed and variable-rate coding, randomization is used 
as follows. Every time a sensor transmits, it randomly chooses 
from a group of essentially identical encoding functions. The 
index of the chosen function is transmitted to the decoder 
along with its output. Without this randomization, a traitor 
that transmits before an honest sensor i would know exactly 
the messages that sensor i will send. In particular, it would be 
able to find fake sequences for sensor i that would produce 
those same messages. If the traitor tailors the messages it 
sends to the decoder to match one of those fake sequences, 
when sensor i then transmits, it would appear to corroborate 
this fake sequence, causing an error. By randomizing the
choice of encoding function, the set of sequences producing 
the same message is not fixed, so a traitor can no longer 
know with certainty that a particular fake source sequence 
will result in the same messages by sensor i as the true
one. This is not unlike Wyner's wiretap channel [11], in
which information is kept from the wiretapper by introducing 
additional randomness. 

In both variable-rate and randomized fixed-rate coding, 
we assume that the traitors know nothing about randomness 
produced at an honest sensor. Of course, after the randomness 
has been transmitted, the traitors should have access to that 
information, which is what we assume in the variable-rate 
case. However, for the fixed-rate setup, there is no notion 
of a transmission order, so it would be meaningless to say 
that the traitors only know about the randomness "after" it 
has been transmitted. The only choice is to assume that the 
traitors never find out anything about the randomness. This 
might be a realistic assumption if the traitors are not able to 
monitor transmissions to the decoder, but we believe that in 
most cases it is not. Hence deterministic fixed-rate coding is 
more realistic. 

The rest of the paper is organized as follows. In Section II,
we formally give the variable-rate model and present the
main result of the paper, which we prove in Section III. In
Section IV, we give the rate regions for the fixed-rate setups
and illustrate that fixed-rate coding is suboptimal. Finally, in
Section V, we offer some future avenues for research.

II. Variable-Rate Model and Result 

A. Notation 

Let $X_i$ be the random variable revealed to sensor i, $\mathcal{X}_i$ the
alphabet of that variable, and $x_i$ the corresponding realization.
A sequence of random variables revealed to sensor i over n
timeslots is denoted $X_i^n$, and a realization of it $x_i^n \in \mathcal{X}_i^n$.
Let $\mathcal{M} = \{1, \ldots, m\}$. For a set $s \subset \mathcal{M}$, let $X_s$ be the set of
random variables $\{X_i\}_{i \in s}$, and define $x_s$ and $\mathcal{X}_s$ similarly. By
$s^c$ we mean $\mathcal{M} \setminus s$. Let $T_\epsilon^n(X_s)[q]$ be the strongly typical set
with respect to the distribution q, or the source distribution p
if unspecified. Similarly, $H_q(X_s)$ is the entropy with respect
to the distribution q, or p if unspecified. All variations on $\epsilon$,
such as $\epsilon'$, $\epsilon''$, $\tilde{\epsilon}$, are assumed to go to 0 as $\epsilon$ goes to 0 and may
appear without definition. It is meant that either the definition
is discernible from context or the existence will be shown.

B. Communication Protocol 

The transmission protocol is composed of L transactions. 
In each transaction, the decoder selects a sensor to receive 
information from and selects which of K encoding functions 
it should use. The sensor then responds by executing that 
encoding function and transmitting its output back to the 
decoder. For each sensor $i \in \mathcal{M}$ and encoding function
$j \in \{1, \ldots, K\}$, there is an associated rate $R_{i,j}$. On the $l$th
transaction, let $i_l$ and $j_l$ be the sensor and encoding function
chosen by the decoder, and let $h_l$ be the number of times $i_l$
has transmitted prior to the $l$th transaction. Note that $i_l$, $j_l$, and $h_l$
are random variables, since they are chosen by the decoder
based on messages it has received, which depend on the source
values. The $j$th encoding function for the $i$th sensor is given
by

$$f_{i,j} : \mathcal{X}_i^n \times \mathcal{Z} \times \{1, \ldots, K\}^{h_l} \to \{1, \ldots, 2^{nR_{i,j}}\}$$



where $\mathcal{Z}$ represents randomness generated at the sensor. Let
$I_l \in \{1, \ldots, 2^{nR_{i_l,j_l}}\}$ be the message received by the decoder
in the $l$th transaction. If $i_l$ is an honest sensor, then $I_l =
f_{i_l,j_l}(X_{i_l}^n, \rho_{i_l}, J_l)$, where $\rho_{i_l} \in \mathcal{Z}$ is the randomness from
sensor $i_l$ and $J_l \in \{1, \ldots, K\}^{h_l}$ is the history of encoding
functions used by sensor $i_l$ so far. If $i_l$ is a traitor, however, it
may choose $I_l$ based on all sources $X_1^n, \ldots, X_m^n$, all previous
transmissions $I_1, \ldots, I_{l-1}$, and polling history $i_1, \ldots, i_{l-1}$ and
$j_1, \ldots, j_{l-1}$. In particular, it does not have access to the
randomness $\rho_i$ for any honest sensor i.

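After the decoder receives $I_l$, if $l < L$ it uses $I_1, \ldots, I_l$ to
choose the next sensor $i_{l+1}$ and its encoding function index
$j_{l+1}$. After the $L$th transaction, it decodes according to the
decoding function

$$g : \prod_{l=1}^{L} \{1, \ldots, 2^{nR_{i_l,j_l}}\} \to \mathcal{X}_1^n \times \cdots \times \mathcal{X}_m^n.$$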

C. Variable-Rate Problem Statement and Main Result 

Let $\mathcal{H} \subset \mathcal{M}$ be the set of honest sensors. Define the probability
of error $P_e = \Pr(\hat{X}_{\mathcal{H}}^n \ne X_{\mathcal{H}}^n)$, where $(\hat{X}_1^n, \ldots, \hat{X}_m^n) =
g(I_1, \ldots, I_L)$. This will in general depend on the actions of the
traitors. Note again that the only source estimates that matter
are those corresponding to the honest sensors.

We define a sum rate R to be $\epsilon$-achievable if for every $\delta > 0$
and sufficiently large n there exists a code such that, for any
choice of actions by the traitors, $P_e \le \epsilon$ and

$$\sum_{l=1}^{L} R_{i_l,j_l} \le R + \delta. \tag{4}$$

Note that $i_l$ and $j_l$ depend on the sensor transmissions, so they
are random variables. By (4) we mean that for any messages
sent by the sensors, we never exceed a sum rate of $R + \delta$. A
sum rate R is achievable if it is $\epsilon$-achievable for every $\epsilon > 0$.
Let $R^*$ be the minimum achievable sum rate. Certainly then
all $R > R^*$ are also achievable.

Some definitions will allow us to state our main result. Let

$$\mathcal{V} = \{s \subset \mathcal{M} : |s| = m - t\}.$$

This is the collection of all possible sets of honest sensors.
For any $V \subset \mathcal{V}$, define

$$\mathcal{Q}(V) \triangleq \{q(x_1 \cdots x_m) : \forall s \in V,\ q(x_s) = p(x_s)\}. \tag{5}$$

Let $U(V) = \bigcup_{s \in V} s$. Finally, define

$$\mathcal{Q} = \bigcup_{V \subset \mathcal{V} : U(V) = \mathcal{M}} \mathcal{Q}(V).$$

That is, $\mathcal{Q}$ is the set of distributions q such that for each i,
there is a marginal distribution of q of $m - t$ variables including
$X_i$ that matches the corresponding marginal distribution of p.
Thus, those $m - t$ sensors behave as if they were the set of
honest sensors, since their sources are distributed correctly.
Since every i falls into such a set, every sensor looks like it
could be honest.
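
The index set of the union defining $\mathcal{Q}$ can be enumerated directly for small m. The sketch below is illustrative only: the function names `honest_candidates` and `covering_collections` are hypothetical, sensors are numbered 1 to m, and the brute-force search over sub-collections is only practical for small m.

```python
from itertools import combinations

def honest_candidates(m, t):
    """All sets of m - t sensors, i.e. the possible sets of honest sensors."""
    return [frozenset(s) for s in combinations(range(1, m + 1), m - t)]

def covering_collections(m, t):
    """Sub-collections V of the candidate sets whose union U(V) is all sensors."""
    candidates = honest_candidates(m, t)
    everyone = frozenset(range(1, m + 1))
    result = []
    for size in range(1, len(candidates) + 1):
        for V in combinations(candidates, size):
            if frozenset().union(*V) == everyone:
                result.append(V)
    return result

for t in range(3):
    print(t, len(covering_collections(3, t)))
```

With m = 3 and t = 0 the only covering collection is the full set of sensors, which recovers $\mathcal{Q} = \{p\}$. With t = 2 the only covering collection is the three singletons, which recovers the matching-marginals condition described for $t = m - 1$.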



Theorem 1: The minimum achievable sum rate is

$$R^* = \sup_{q \in \mathcal{Q}} H_q(X_1 \cdots X_m). \tag{6}$$

It can be shown that for $t = 1$ and arbitrary m, (6) becomes

$$R^* = H(X_1 \cdots X_m) + \max_{i,i' \in \mathcal{M}} I(X_i; X_{i'} | X_{\{i,i'\}^c}). \tag{7}$$

Relative to the Slepian-Wolf result, we see that we always pay
a conditional mutual information penalty for a single traitor.
Similar expressions can be found for $t = 2$, $t = m - 2$, and
$t = m - 1$ (the last given by (3)). However, analytic expressions
do not in general exist for $3 \le t \le m - 3$.

III. Proof of the Variable-Rate Theorem 

A. Converse 

We first show the converse. Let q be the distribution that
maximizes the entropy in (6). For some s with $|s| = m - t$,
we can write $q = p(x_s) q(x_{s^c} | x_s)$. Thus if the $s^c$ sensors
are the traitors, they can simulate the conditional distribution
$q(x_{s^c} | x_s)$, the outcome of which, when combined with the
true values of $X_s$, will produce a set of $X_1 \cdots X_m$ distributed
according to q. Since $q \in \mathcal{Q}$, if the traitors act honestly with
these fabricated source values, the decoder will not be able
to correctly identify a single traitor, so it has no choice but
to perfectly decode every value. To do this, it must receive at
least $nH_q(X_{\mathcal{M}})$ bits, which means $R^* \ge H_q(X_{\mathcal{M}})$.

B. Achievability Preliminaries 

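Now we prove achievability. To do so, we will need the
following definitions. For some $V \subset \mathcal{V}$, let

$$S_\epsilon^n(X_{\mathcal{M}})[V] \triangleq \bigcap_{s \in V} \{x_{\mathcal{M}}^n \in \mathcal{X}_{\mathcal{M}}^n : x_s^n \in T_\epsilon^n(X_s)\},$$

where $T_\epsilon^n$ is the strongly typical set. For $s, s' \subset \mathcal{M}$ and $x_{s'}^n \in
\mathcal{X}_{s'}^n$, we define the conditional version

$$S_\epsilon^n(X_s | x_{s'}^n)[V] \triangleq \{x_s^n \in \mathcal{X}_s^n : \exists\, x_{(s \cup s')^c}^n \in \mathcal{X}_{(s \cup s')^c}^n : (x_s^n\, x_{s'}^n\, x_{(s \cup s')^c}^n) \in S_\epsilon^n(X_{\mathcal{M}})[V]\}.$$

The following lemma shows that $S_\epsilon^n$ is contained in a union
of typical sets.

Lemma 1: Fix $s, s' \subset \mathcal{M}$ and $x_{s'}^n \in \mathcal{X}_{s'}^n$. Then

$$S_\epsilon^n(X_s | x_{s'}^n)[V] \subset \bigcup_{q \in \mathcal{Q}(V)} T_{\epsilon'}^n(X_s | x_{s'}^n)[q].$$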

C. Coding Scheme Procedure 

We propose a multiround coding scheme. Each round is 
made up of m phases. In the ith phase, transactions are made 
entirely with sensor i. In addition, all transactions in the first 
round are based on the first k source values, transactions in 
the second round on the second k source values, and so on. 
Each transaction in the ith phase will be associated with a 
target set chosen by the decoder of the form 

$$T_R(x_s^k) \triangleq \bigcup_{q : H_q(X_i | X_s) \le R} T_{\epsilon'}^k(X_i | x_s^k)[q] \tag{8}$$

with $s \subset \mathcal{M}$ to be defined, and $\epsilon'$ is as defined in Lemma 1. It
takes about kR bits to encode any sequence in this set, so we
can think of $T_R(x_s^k)$ as the set of all the sequences that can be
decoded if a sensor has only sent kR bits so far in the current
phase. The strategy will be to slowly increase R, expanding
$T_R(x_s^k)$ until it contains the relevant source sequence.

The decoder will attempt to determine whether the source
sequence is contained in $T_R(x_s^k)$, and if so to decode it. Sensor
i will randomly choose from a number of encoding functions
$f_1, \ldots, f_C$. Each of these encoding functions will be created
by means of a random binning procedure and the codebooks
revealed to both the sensor and decoder. Sensor i will transmit
up to $k(R + \epsilon)$ bits containing the index of the randomly chosen
encoding function and its output. If there is exactly one source
sequence in the target set that matches every value received
so far from sensor i in this round, call it $\hat{x}_i^k$. If there is more
than one such sequence, we declare an error. If there is no
such sequence, we conclude that the source sequence is not
contained in the target set, increase R by $\epsilon$, and do another
transaction. Note that when $R \ge \log |\mathcal{X}_i|$, every sequence will
be in $T_R(x_s^k)$, so we will definitely decode the sequence or
declare an error.

The collection $V \subset \mathcal{V}$ will always contain only those sets
that could be the set of honest sensors. We begin by setting
$V = \mathcal{V}$, and pare it down after each round based on new
information. Define $s_i \triangleq \{1, \ldots, i\} \cap U(V)$. Phase i of any
round is made up of the following steps.

1) If $i \notin U(V)$, ignore i and go to the next phase.

2) Otherwise, let $R = \epsilon$.

3) Receive up to $k(R + \epsilon)$ bits from sensor i, with target
set $T_R(\hat{x}_{s_{i-1}}^k)$. If possible, decode the sequence to $\hat{x}_i^k$
and go to the next phase. If not, increase R by $\epsilon$ and
repeat.

4) After phase m, let $V' \subset V$ be the largest subset of V
such that $\hat{x}_{U(V)}^k \in S_\epsilon^k(X_{U(V)})[V']$. Use $V'$ as V in the
next round. If there is no such $V'$, declare an error.
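
The rate escalation in steps 2) and 3) can be sketched as a simple loop. This is only an illustration of the control flow, not the coding scheme itself. The callback `matching_candidates` is a hypothetical stand-in for the decoder's search of the target set, and the toy callback below simply pretends that the true sequence enters the target set once R reaches 0.35.

```python
def run_phase(matching_candidates, eps, max_rate):
    """Raise R in steps of eps until exactly one candidate sequence remains.

    matching_candidates(R) is a hypothetical callback: it returns the list of
    sequences in the target set for rate R that agree with every value
    received so far from the sensor.
    """
    step = 1
    while True:
        rate = step * eps
        candidates = matching_candidates(rate)
        if len(candidates) == 1:
            return candidates[0], rate, step      # decoded in this transaction
        if len(candidates) > 1:
            raise RuntimeError("decoding error: several sequences match")
        if rate >= max_rate:
            raise RuntimeError("no sequence matched at the maximum rate")
        step += 1                                 # increase R by eps and repeat

def toy_candidates(rate):
    # Assumption for illustration: the sequence becomes decodable at R >= 0.35.
    return ["x_hat"] if rate >= 0.35 else []

decoded, final_rate, transactions = run_phase(toy_candidates, eps=0.1, max_rate=1.0)
print(decoded, final_rate, transactions)
```

In the scheme described above, `max_rate` would play the role of $\log |\mathcal{X}_i|$, the rate at which every sequence is in the target set.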

D. Code Rate 

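It can be shown that the probability of error can be made
arbitrarily small if C, the number of encoding functions from
which each sensor chooses randomly during each transaction,
is sufficiently large. We can then make k large enough that
transmitting the index of the chosen encoding function takes
negligible rate compared to transmitting its output. Thus in
each phase we need only transmit $R + \epsilon$ bits per symbol. Let
$q_{\hat{x}}$ be the type of $\hat{x}_{U(V)}^k$. The total number of bits sent per
symbol for the entire round is therefore at most

\begin{align}
&\sum_{i=1}^{m} \Big[ \inf_{q : \hat{x}_i^k \in T_{\epsilon'}^k(X_i | \hat{x}_{s_{i-1}}^k)[q]} H_q(X_i | X_{s_{i-1}}) + \epsilon + \tilde{\epsilon} \Big] \notag \\
&\quad \le \inf_{q : \hat{x}_{U(V)}^k \in T_{\epsilon'}^k(X_{U(V)})[q]} \sum_{i=1}^{m} H_q(X_i | X_{s_{i-1}}) + m(\epsilon + \tilde{\epsilon}) \tag{9} \\
&\quad \le H_{q_{\hat{x}}}(X_{U(V)}) + m(\epsilon + \tilde{\epsilon}) \tag{10} \\
&\quad \le \sup_{q \in \mathcal{Q}(V')} H_q(X_{U(V)}) + \epsilon'' \tag{11} \\
&\quad \le \sup_{q \in \mathcal{Q}} H_q(X_{\mathcal{M}}) + \log |\mathcal{X}_{U(V) \setminus U(V')}| + \epsilon'' \tag{12}
\end{align}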



where (9) holds because the set of distributions q such
that $\hat{x}_{s_i}^k \in T_{\epsilon'}^k(X_{s_i})[q]$ contains the set of distributions q
such that $\hat{x}_{U(V)}^k \in T_{\epsilon'}^k(X_{U(V)})[q]$, and (10) holds because
$\hat{x}_{U(V)}^k$ is typical with respect to its own type. Because
$\hat{x}_{U(V)}^k \in S_\epsilon^k(X_{U(V)})[V']$, by Lemma 1, for some $q \in \mathcal{Q}(V')$,
$\hat{x}_{U(V)}^k \in T_{\epsilon'}^k(X_{U(V)})[q]$. For this q, for all $x_{U(V)} \in \mathcal{X}_{U(V)}$,
$|q_{\hat{x}}(x_{U(V)}) - q(x_{U(V)})| \le \epsilon' / |\mathcal{X}_{U(V)}|$. Since the distributions are
arbitrarily close, the entropies with respect to these distributions
will be arbitrarily close, so (11) holds.

If $U(V') = U(V)$, then the second term in (12) is 0,
so we can bound (12) by $\sup_{q \in \mathcal{Q}} H_q(X_{\mathcal{M}}) + \epsilon''$. However, if
$U(V) \setminus U(V') \ne \emptyset$, we cannot. Even so, since at least one
sensor is eliminated whenever $U(V) \setminus U(V') \ne \emptyset$, this can
only happen for at most t rounds, after which we will have
eliminated every traitor. Thus with enough rounds, we can
always bound the sum rate by $\sup_{q \in \mathcal{Q}} H_q(X_{\mathcal{M}}) + \epsilon''$.

IV. Fixed-Rate Results 
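Consider an m-tuple of rates $(R_1, \ldots, R_m)$, encoding functions
$f_i : \mathcal{X}_i^n \to \{1, \ldots, 2^{nR_i}\}$ for $i \in \mathcal{M}$, and decoding
function

$$g : \prod_{i=1}^{m} \{1, \ldots, 2^{nR_i}\} \to \mathcal{X}_1^n \times \cdots \times \mathcal{X}_m^n.$$

Let $I_i \in \{1, \ldots, 2^{nR_i}\}$ be the message transmitted by sensor
i. If sensor i is honest, $I_i = f_i(X_i^n)$. If it is a traitor,
it may choose $I_i$ arbitrarily, based on all the sources $X_{\mathcal{M}}^n$.
Define the probability of error $P_e = \Pr(\hat{X}_{\mathcal{H}}^n \ne X_{\mathcal{H}}^n)$, where

$$(\hat{X}_1^n, \ldots, \hat{X}_m^n) = g(I_1, \ldots, I_m).$$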

We say an m-tuple $(R_1, \ldots, R_m)$ is deterministic-fixed-rate
achievable if for any $\epsilon > 0$ and sufficiently large n, there
exist coding functions $f_i$ and g such that, for any choice of
actions by the traitors, $P_e \le \epsilon$. Let $\mathcal{R}_{\text{dfr}} \subset \mathbb{R}^m$ be the set of
deterministic-fixed-rate achievable m-tuples.

Define an m-tuple to be randomized-fixed-rate achievable
in the same way as above, except we allow the encoding
functions $f_i$ to be randomized. Let $\mathcal{R}_{\text{rfr}} \subset \mathbb{R}^m$ be the set of
randomized-fixed-rate achievable rate vectors.

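For any $s \subset \mathcal{M}$, let $\mathrm{SW}(X_s)$ be the Slepian-Wolf rate region
for the random variables $X_s$. For any integer $k \le m$, define

$$\mathcal{R}_k = \{(R_1 \cdots R_m) : \forall s \subset \mathcal{M}, |s| = k : R_s \in \mathrm{SW}(X_s)\}.$$

The following theorem gives the rate regions explicitly.
Theorem 2: The fixed-rate achievable regions are given by

$$\mathcal{R}_{\text{dfr}} = \mathcal{R}_{\max\{1, m-2t\}}, \qquad \mathcal{R}_{\text{rfr}} = \mathcal{R}_{m-t}.$$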
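
Membership in the region $\mathcal{R}_k$ can be tested by brute force for a small source. The sketch below assumes the standard description of the Slepian-Wolf region, in which every nonempty subset a of s must satisfy $\sum_{i \in a} R_i \ge H(X_a | X_{s \setminus a})$. The function names, the zero-based sensor indices, and the example source of three identical bits are all illustrative assumptions.

```python
from itertools import combinations
from math import log2

def entropy(pmf, idx):
    """Entropy in bits of the marginal of pmf on the coordinates in idx."""
    marg = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in idx)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def in_slepian_wolf(pmf, rates, s):
    """Check that the rates restricted to s lie in SW(X_s)."""
    for size in range(1, len(s) + 1):
        for a in combinations(s, size):
            rest = [i for i in s if i not in a]
            cond = entropy(pmf, s) - entropy(pmf, rest)   # H(X_a | X_rest)
            if sum(rates[i] for i in a) < cond - 1e-12:
                return False
    return True

def in_region_k(pmf, rates, k):
    """Check membership in R_k: every size-k subset must satisfy Slepian-Wolf."""
    return all(in_slepian_wolf(pmf, rates, s)
               for s in combinations(range(len(rates)), k))

# Hypothetical source: three sensors that all observe the same uniform bit.
pmf = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
m, t = 3, 1
rates = (0.5, 0.5, 0.5)
print(in_region_k(pmf, rates, m - t))              # randomized region, k = 2
print(in_region_k(pmf, rates, max(1, m - 2 * t)))  # deterministic region, k = 1
```

For this source the rate triple (0.5, 0.5, 0.5) satisfies the randomized condition but not the deterministic one, which requires each rate to be at least $H(X_i) = 1$.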

We omit the proof of this, but we briefly illustrate that
circumstances exist for which fixed-rate coding is suboptimal
compared to variable-rate coding. Suppose $m = 3$ and $t = 1$.
Recall from (7) that the variable-rate minimum achievable sum
rate is given by

$$R^* = H(X_1 X_2 X_3) + \max\{I(X_1; X_2 | X_3), I(X_1; X_3 | X_2), I(X_2; X_3 | X_1)\}. \tag{13}$$

Suppose that $I(X_1; X_2 | X_3)$ achieves this maximum. If the rate
triple $(R_1, R_2, R_3)$ is randomized fixed-rate achievable, then
$(R_1, R_2, R_3) \in \mathcal{R}_2$, which means $R_i + R_j \ge H(X_i X_j)$ for
all $i, j \in \{1, 2, 3\}$. Thus

\begin{align}
R_1 + R_2 + R_3 &\ge \tfrac{1}{2} \left[ H(X_1 X_2) + H(X_1 X_3) + H(X_2 X_3) \right] \notag \\
&= H(X_1 X_2 X_3) + \tfrac{1}{2} \left[ I(X_1; X_2 | X_3) + I(X_1 X_2; X_3) \right]. \tag{14}
\end{align}

If $I(X_1 X_2; X_3) > I(X_1; X_2 | X_3)$, (14) is larger than (13).
Hence, for some source distributions, a larger sum rate is
required for fixed-rate coding than variable-rate coding.
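
The comparison between (13) and (14) can be made concrete with a small numerical check. The sketch below evaluates both expressions for a hypothetical source in which all three sensors observe the same uniform bit. This example source and the helper names are assumptions chosen for illustration and do not appear in the paper.

```python
from math import log2

def entropy(pmf, idx):
    """Entropy in bits of the marginal of pmf on the coordinates in idx."""
    marg = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in idx)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def cond_mutual_info(pmf, a, b, c):
    """I(X_a; X_b | X_c) computed from joint entropies."""
    return (entropy(pmf, a + c) + entropy(pmf, b + c)
            - entropy(pmf, a + b + c) - entropy(pmf, c))

# Hypothetical source: X1 = X2 = X3 = one uniform bit.
pmf = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

joint = entropy(pmf, [0, 1, 2])
variable_rate = joint + max(                      # expression (13)
    cond_mutual_info(pmf, [0], [1], [2]),
    cond_mutual_info(pmf, [0], [2], [1]),
    cond_mutual_info(pmf, [1], [2], [0]),
)
fixed_rate_bound = 0.5 * (                        # first line of (14)
    entropy(pmf, [0, 1]) + entropy(pmf, [0, 2]) + entropy(pmf, [1, 2])
)
print(variable_rate, fixed_rate_bound)
```

For this source every conditional mutual information in (13) is zero, so the variable-rate sum rate equals the joint entropy of 1 bit, while any randomized fixed-rate scheme needs a sum rate of at least 1.5 bits.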

V. Future Work 

Much more work could be done in the area of Byzantine 
network source coding. In this paper, we assumed that the 
traitors have access to all the source values, an assumption 
that was vital in our converse proofs. This is a significant 
assumption that may not be all that realistic. It would be 
worthwhile, though perhaps more difficult, to characterize the 
achievable rate region without this assumption, assuming that 
the traitors have access only to their own source values, or 
possibly degraded versions of those of the honest sensors. 

Finally, we could consider Byzantine attacks on other sorts 
of multi-terminal source coding problems, such as the rate 
distortion problem [12], [13] or the CEO problem [14]. 